STAT40830 - Adv Data Prog With R (online)

HW Assignment 1

Deepankar Vyas, 23200527

June 6, 2024

Question Statement

Create a neat Quarto document or presentation. In the document/presentation, you must do a plot on a dataset that can be found on any R package and include some text explaining what you plotted. The file does not need to be extensive. If you wish to add some extra text, consider adding an explanation of your approach, some descriptive statistics of the variables used in the graph and a line or two introducing the dataset as part of an introduction.

IRIS Dataset

  • The famous (Fisher’s or Anderson’s) iris data set gives the measurements, in centimeters, of four variables.
  • The variables are sepal length, sepal width, petal length and petal width, recorded for 50 flowers from each of 3 species of iris (150 flowers in total).
  • The species are Iris setosa, versicolor, and virginica.
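As a quick check of the description above, the following minimal sketch loads the built-in `iris` data frame and inspects its shape and species counts. It uses only base R and is an illustration, not necessarily the code used to build this document.

```r
# `iris` ships with base R (the datasets package), so no install step is needed.
data(iris)

# Dimensions: rows are flowers, columns are the 4 measurements plus Species.
dim(iris)

# Number of flowers recorded for each species.
table(iris$Species)

# Column names and types at a glance.
str(iris)
```

The species table should show the same count for each of the three species.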

Petal Length vs Petal Width

Variables’ Overview

Here, we have drawn a scatter plot of petal length against petal width, coloured by species, to look for patterns. A quick rundown of the two variables in question:

Summary of Petal Length:

Statistic  Value (cm)
Min.       1.000
1st Qu.    1.600
Median     4.350
Mean       3.758
3rd Qu.    5.100
Max.       6.900

Summary of Petal Width:

Statistic  Value (cm)
Min.       0.100000
1st Qu.    0.300000
Median     1.300000
Mean       1.199333
3rd Qu.    1.800000
Max.       2.500000
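Summary tables like the ones above could be produced along the following lines. This is a sketch under assumptions: `summarise_variable` is a hypothetical helper introduced here for illustration, and the column names `Statistic` and `Value` are a labelling choice, not something taken from the original code.

```r
# Hypothetical helper (not from the original document): turn summary() output
# into a two-column data frame with readable column names.
summarise_variable <- function(x) {
  s <- summary(x)  # Min., 1st Qu., Median, Mean, 3rd Qu., Max.
  data.frame(
    Statistic = names(s),
    Value     = as.numeric(s),
    row.names = NULL
  )
}

petal_length_summary <- summarise_variable(iris$Petal.Length)
petal_width_summary  <- summarise_variable(iris$Petal.Width)

petal_length_summary
petal_width_summary
```

Naming the columns explicitly avoids generic default headers when the table is printed in the rendered document.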

A quick analysis of the graph follows:

Analysis

  1. From the graph, there appears to be a strong positive correlation between petal length and petal width.
  2. Different species of flowers display different characteristics.
  3. Setosa generally has the lowest petal length and petal width. Versicolor has higher petal length and petal width than Setosa, but generally lower than Virginica. Virginica tends to have the highest petal length and width among the 3 species considered.
  4. More details about the dataset can be found here
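The visual impressions in the analysis above can be checked numerically. The sketch below computes the Pearson correlation between the two petal measurements and the per-species means. The names `petal_cor` and `petal_means` are assumptions introduced for this example.

```r
# Pearson correlation between petal length and petal width across all flowers.
petal_cor <- cor(iris$Petal.Length, iris$Petal.Width)
round(petal_cor, 2)

# Mean petal length and width per species, to compare the three groups.
petal_means <- aggregate(
  cbind(Petal.Length, Petal.Width) ~ Species,
  data = iris,
  FUN  = mean
)
petal_means
```

A correlation close to 1 supports the first point, and group means that increase from setosa to versicolor to virginica support the third.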

Sepal Length vs Sepal Width - plot

Summary statistics of Sepal Length and Sepal Width

Summary of Sepal Length:

Statistic  Value (cm)
Min.       4.300000
1st Qu.    5.100000
Median     5.800000
Mean       5.843333
3rd Qu.    6.400000
Max.       7.900000

Summary of Sepal Width:

Statistic  Value (cm)
Min.       2.000000
1st Qu.    2.800000
Median     3.000000
Mean       3.057333
3rd Qu.    3.300000
Max.       4.400000
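A scatter plot of the two sepal variables, coloured by species, could be drawn roughly as follows. This sketch uses base R graphics rather than any particular plotting package, and the colour palette and the name `species_colours` are assumptions made for illustration, so the result may differ in appearance from the plot in this document.

```r
# Assumed palette: one named colour per species (illustrative choice only).
species_colours <- c(
  setosa     = "firebrick",
  versicolor = "forestgreen",
  virginica  = "steelblue"
)

# Scatter plot of sepal length against sepal width, one point per flower.
plot(
  iris$Sepal.Length, iris$Sepal.Width,
  col  = species_colours[as.character(iris$Species)],
  pch  = 19,
  xlab = "Sepal length (cm)",
  ylab = "Sepal width (cm)",
  main = "Sepal Length vs Sepal Width"
)

# Legend mapping each colour back to its species.
legend(
  "topright",
  legend = names(species_colours),
  col    = species_colours,
  pch    = 19,
  title  = "Species"
)
```

Swapping in `Petal.Length` and `Petal.Width` (and the matching axis labels) would give the corresponding petal plot.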